Creating an Order in Digital Libraries with Self-Organizing

Authors

  • Samuel Kaski
  • Timo Honkela
  • Krista Lagus
Abstract

Formulation of suitable search expressions for information retrieval from large full-text databases may currently require considerable efforts. Changing the scope of the search when, e.g., too many or too few hits have been obtained, requires re-formulation of the search expression. For an alternative scheme we suggest an explorative full-text information retrieval method, where the Self-Organizing Map (SOM) algorithm is used to order documents based on their full textual contents. The visualized order can then be utilized for an explorative search or exploration of novel knowledge areas, whereby the scope can be changed interactively. The ordering of the documents is achieved by a two-level analysis: First, word categories are extracted from the text by a "semantic" SOM. Second, the textual context of the documents is encoded on the basis of the histograms of words formed on the word category map.
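The two-level analysis described in the abstract can be sketched roughly as follows. This is a minimal illustration in NumPy, not the authors' implementation: the toy corpus, the map sizes, the learning-rate and neighborhood schedules, and the choice to represent each word by the averaged random codes of its neighboring words are all assumptions made for the example. The helper names (train_som, best_units, word_context_vectors, document_histograms) are hypothetical.

```python
import numpy as np

def train_som(data, rows, cols, epochs=50, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=0.1, size=(rows * cols, data.shape[1]))
    # Grid coordinates of each unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = 0.5 * (1.0 - frac) + 0.01                      # decaying learning rate (assumed schedule)
            sigma = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5  # shrinking neighborhood radius
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))   # best-matching unit
            dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))             # Gaussian neighborhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def best_units(data, weights):
    """Index of the best-matching unit for every row of data."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def word_context_vectors(docs, dim=8, seed=0):
    """Assumed encoding: average random codes of the preceding and following word."""
    rng = np.random.default_rng(seed)
    vocab = sorted({w for d in docs for w in d})
    code = {w: rng.normal(size=dim) for w in vocab}
    sums = {w: np.zeros(2 * dim) for w in vocab}
    counts = {w: 0 for w in vocab}
    zero = np.zeros(dim)
    for d in docs:
        for i, w in enumerate(d):
            prev = code[d[i - 1]] if i > 0 else zero
            nxt = code[d[i + 1]] if i + 1 < len(d) else zero
            sums[w] += np.concatenate([prev, nxt])
            counts[w] += 1
    return vocab, np.array([sums[w] / counts[w] for w in vocab])

def document_histograms(docs, vocab, word_units, n_units):
    """Encode each document as a normalized histogram over word-category units."""
    unit_of = dict(zip(vocab, word_units))
    hist = np.zeros((len(docs), n_units))
    for i, d in enumerate(docs):
        for w in d:
            hist[i, unit_of[w]] += 1
        hist[i] /= max(len(d), 1)
    return hist

# Toy corpus (illustrative only).
docs = [s.split() for s in [
    "neural network learns weights",
    "neural network trains weights",
    "library stores books",
    "library lends books",
]]

# Level 1: word category map from word contexts.
vocab, ctx = word_context_vectors(docs)
word_som = train_som(ctx, rows=3, cols=3)
word_units = best_units(ctx, word_som)

# Level 2: document map from histograms over the word category map.
hists = document_histograms(docs, vocab, word_units, n_units=9)
doc_som = train_som(hists, rows=2, cols=2, seed=1)
doc_units = best_units(hists, doc_som)
```

In this toy setup, words that occur in identical contexts (here "learns" and "trains") receive identical context vectors and therefore fall on the same unit of the word category map, so documents that differ only in such words get the same histogram and the same position on the document map.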

Similar resources

Creating an Order in Distributed Digital Libraries by Integrating Independent Self-Organizing Maps

Digital document libraries are an almost perfect application arena for unsupervised neural networks. This is because many of the operations computers have to perform on text documents are classification tasks based on "noisy" input patterns. The "noise" arises because of the known inaccuracy of mapping natural language to an indexing vocabulary representing the contents of the documents. A growing...

A Scalable Self-organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation

The rapid proliferation of textual and multimedia online databases, digital libraries, Internet servers, and intranet services has turned researchers' and practitioners' dream of creating an information-rich society into a nightmare of info-gluts. Many researchers believe that turning an info-glut into a useful digital library requires automated techniques for organizing and categorizing large-...

MinervaDL: An Architecture for Information Retrieval and Filtering in Distributed Digital Libraries

We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of MinervaDL is based on the peer-to-peer search engine Minerva, and is able to handle huge amounts of data provided by digital libraries in a distributed and self-organizing way. The two-tier architecture and the us...

Organization of Distributed Digital Libraries: A Neural Network-Based Approach

Self-organizing maps are a popular neural network model for mapping high-dimensional input data onto a lower-dimensional output space. However, as the size of the training data increases, both the necessary computational power and the training time required exceed tolerable limits. Still more important, not all training data may be available in one central location but may rather be coll...

A Systematic Review of Data Mining Applications in Digital Libraries

Purpose: This study aimed to identify the applications of data mining in the provision of services, collection, and management of digital libraries. Methodology: This is an applied study in terms of purpose and a qualitative study in terms of method, conducted as a systematic review. For this purpose, articles were obtained by searching the databases of Springer, Emerald, ProQuest,...

Publication date: 1996